Semantic Spaces based on Free Association that Predict Memory Performance

Authors

  • Mark Steyvers
  • Richard M. Shiffrin
  • Douglas L. Nelson
Abstract

Many memory models represent aspects of words, such as meaning, by vectors of feature values, such that words with similar meanings are placed in similar regions of the semantic space whose dimensions are defined by the vector positions. Methods for constructing such spaces include those based on scaling similarity ratings for pairs of words, and those based on the analysis of co-occurrence statistics of words in contexts (Landauer & Dumais, 1997). We utilized a Word Association Space (WAS), based on a scaling of a large database of free word associations: words with similar associative structures were placed in similar regions of the high-dimensional semantic space. In comparison to LSA and other measures based on associative strength, we showed that the similarity structure in WAS is well suited to predict similarity ratings in recognition memory, percentage of correct responses in cued recall, and intrusion rates in free recall. We suggest that the WAS approach is a useful and important new tool in the workshop of theorists studying semantic effects in episodic memory.

An increasingly common assumption of theories of memory is that the meaning of a word can be represented by a vector that places the word as a point in a multidimensional semantic space (Bower, 1967; Landauer & Dumais, 1997; Lund & Burgess, 1996; Morton, 1970; Norman & Rumelhart, 1970; Osgood, Suci, & Tannenbaum, 1957; Underwood, 1969; Wickens, 1972). The main requirement of such spaces is that words that are similar in meaning are represented by similar vectors. Representing words as vectors in a multidimensional space allows simple geometric operations, such as the Euclidean distance or the angle between the vectors, to compute the semantic (dis)similarity between arbitrary pairs or groups of words. This representation makes it possible to predict performance in psychological tasks where the semantic distance between pairs or groups of words is assumed to play a role.
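As a minimal sketch of these geometric operations, the Euclidean distance and the cosine of the angle between word vectors can be computed as follows. The feature vectors here are made up for illustration; they are not drawn from the paper's actual WAS or LSA spaces.

```python
import numpy as np

# Hypothetical 4-dimensional feature vectors for three words
# (illustrative values only, not from any fitted semantic space).
vectors = {
    "cat":    np.array([0.9, 0.8, 0.1, 0.0]),
    "feline": np.array([0.8, 0.9, 0.2, 0.1]),
    "truck":  np.array([0.0, 0.1, 0.9, 0.8]),
}

def euclidean_distance(a, b):
    """Dissimilarity: straight-line distance between two word vectors."""
    return float(np.linalg.norm(a - b))

def cosine_similarity(a, b):
    """Similarity: cosine of the angle between two word vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similar meanings -> small distance, cosine near 1.
print(cosine_similarity(vectors["cat"], vectors["feline"]))
print(cosine_similarity(vectors["cat"], vectors["truck"]))
print(euclidean_distance(vectors["cat"], vectors["feline"]))
```

Either measure can serve as the (dis)similarity input to a memory model; the cosine ignores vector length, while the Euclidean distance does not.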
One recent framework for placing words in a multidimensional space is Latent Semantic Analysis, or LSA (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990; Landauer & Dumais, 1997; Landauer, Foltz, & Laham, 1998). The main assumption is that similar words occur in similar contexts, with a context defined as any connected set of text from a corpus such as an encyclopedia, or samples of text from textbooks. For example, a textbook with a paragraph about “cats” might also mention “dogs”, “fur”, “pets”, etc. This knowledge can be used to infer that “cats” and “dogs” are related in meaning. However, some words are clearly related in meaning, such as “cats” and “felines”, yet might never occur simultaneously in the same context. Such words are related primarily through indirect links, because they share similar contexts. The technique of singular value decomposition (SVD) can be applied to the matrix of word-context co-occurrence statistics. In this procedure, the direct and indirect relationships between words and contexts in the matrix are analyzed with simple matrix-algebraic operations, and the result is a high-dimensional space in which words that appear in similar contexts are placed in similar regions of the space. Landauer and Dumais (1997) applied the LSA approach to over 60,000 words appearing in over 30,000 contexts of a large encyclopedia. More recently, LSA was applied to over 90,000 words appearing in over 37,000 contexts of reading material that an English reader might be exposed to from 3rd grade up to the 1st year of college, drawn from various sources such as textbooks, novels, and newspaper articles. The SVD method placed these words in a high dimensional space with the number of dimensions
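The SVD procedure can be sketched on a toy word-by-context count matrix (the counts below are hypothetical, not LSA's actual corpus statistics). Note that "cat" and "feline" never appear in the same context, so their raw count vectors are orthogonal; they become aligned in the reduced space only through the indirect link of sharing contexts with "dog":

```python
import numpy as np

# Hypothetical word-by-context count matrix
# (rows: words, columns: 5 text contexts; illustrative values only).
words = ["cat", "feline", "dog", "truck"]
counts = np.array([
    [2, 0, 3, 0, 0],   # cat:    contexts 0 and 2
    [0, 2, 0, 3, 0],   # feline: contexts 1 and 3 (never with cat)
    [1, 1, 1, 1, 0],   # dog:    shares contexts with both
    [0, 0, 0, 0, 4],   # truck:  an unrelated context
], dtype=float)

# SVD factors the matrix; truncating to the k largest singular values
# yields a reduced space that captures both direct and indirect
# co-occurrence structure.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
word_vecs = U[:, :k] * s[:k]   # k-dimensional word coordinates

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

for w, v in zip(words, word_vecs):
    print(w, np.round(v, 2))

print(cos(counts[0], counts[1]))        # raw counts: orthogonal (0)
print(cos(word_vecs[0], word_vecs[1]))  # reduced space: high (near 1)
```

The dimension that distinguishes "cat" from "feline" carries a smaller singular value than the shared-context structure, so truncation discards it; this is the sense in which SVD recovers indirect relationships from direct co-occurrence statistics.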




Journal:

Volume   Issue

Pages  -

Publication date: 2000